Overview

Dataset statistics

Number of variables: 17
Number of observations: 2111
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 9
Duplicate rows (%): 0.4%
Total size in memory: 280.5 KiB
Average record size in memory: 136.1 B
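
The derived figures in this block follow from the raw totals by simple arithmetic. The sketch below recomputes them; the function name `overview_ratios` and its parameter names are illustrative and are not part of the report or of ydata-profiling.

```python
def overview_ratios(n_rows, n_cols, missing_cells, duplicate_rows, total_kib):
    """Derive the percentage and per-record figures from the raw totals."""
    n_cells = n_rows * n_cols
    return {
        "missing_cells_pct": 100.0 * missing_cells / n_cells,
        "duplicate_rows_pct": 100.0 * duplicate_rows / n_rows,
        # 1 KiB = 1024 bytes
        "avg_record_bytes": total_kib * 1024.0 / n_rows,
    }


# Totals copied from the dataset statistics above.
ratios = overview_ratios(
    n_rows=2111, n_cols=17, missing_cells=0, duplicate_rows=9, total_kib=280.5
)
for name, value in ratios.items():
    print(f"{name}: {value:.1f}")
```

Rounded to one decimal place, the three results agree with the reported 0.0%, 0.4% and 136.1 B.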

Variable types

Categorical: 5
Numeric: 8
Boolean: 4

Alerts

Dataset has 9 (0.4%) duplicate rows [Duplicates]
Gender is highly overall correlated with Height and 1 other field [High correlation]
Height is highly overall correlated with Gender [High correlation]
NObeyesdad is highly overall correlated with Gender and 2 other fields [High correlation]
Weight is highly overall correlated with NObeyesdad and 1 other field [High correlation]
family_history_with_overweight is highly overall correlated with NObeyesdad and 1 other field [High correlation]
CAEC is highly imbalanced (58.1%) [Imbalance]
SMOKE is highly imbalanced (85.4%) [Imbalance]
SCC is highly imbalanced (73.3%) [Imbalance]
MTRANS is highly imbalanced (57.1%) [Imbalance]
FAF has 421 (19.9%) zeros [Zeros]
TUE has 560 (26.5%) zeros [Zeros]
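
The imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the class counts. That formula is an assumption inferred from the numbers in this report, not a quotation of the tool's source. A minimal sketch, with the hypothetical helper name `imbalance_score`:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts (0 = balanced, 1 = one class).

    Assumed formula: H is the Shannon entropy in bits and k is the
    number of classes. Empty classes contribute nothing to H.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1.0 - entropy / math.log2(k)


# Class counts copied from the variable sections of this report.
reported_counts = {
    "CAEC": [1765, 242, 53, 51],
    "SMOKE": [2067, 44],
    "SCC": [2015, 96],
    "MTRANS": [1580, 457, 56, 11, 7],
}
for name, counts in reported_counts.items():
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

Under this assumption the four scores round to the percentages shown in the alerts, which suggests the alert measures how far the class distribution is from uniform rather than the share of the majority class.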

Reproduction

Analysis started: 2026-02-28 14:18:12.132379
Analysis finished: 2026-02-28 14:18:34.611744
Duration: 22.48 seconds
Software version: ydata-profiling v4.18.1
Download configuration: config.json

Variables

Gender
Categorical

Alert: High correlation

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 KiB

Length

Max length: 6
Median length: 4
Mean length: 4.9881573
Min length: 4

Characters and Unicode

Total characters: 10530
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Female
3rd row: Male
4th row: Male
5th row: Male

Common Values

Value  Count  Frequency (%)
Male  1068  50.6%
Female  1043  49.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Value  Count  Frequency (%)
e  3154  30.0%
a  2111  20.0%
l  2111  20.0%
M  1068  10.1%
F  1043  9.9%
m  1043  9.9%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  10530  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  10530  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  10530  100.0%
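
The length and character totals for a categorical variable can be recomputed from its value counts alone. The sketch below does this for Gender; `length_stats` is a hypothetical helper, and the median it returns is the lower median, which equals the ordinary median whenever the number of observations is odd (as it is here, with 2111 rows).

```python
def length_stats(value_counts):
    """Length statistics of a categorical column from {value: count}."""
    lengths = sorted((len(value), count) for value, count in value_counts.items())
    n = sum(count for _, count in lengths)
    total_chars = sum(length * count for length, count in lengths)

    # Walk the sorted lengths until the middle observation is reached.
    midpoint = (n - 1) // 2
    seen = 0
    median = None
    for length, count in lengths:
        seen += count
        if seen > midpoint:
            median = length
            break

    return {
        "total_characters": total_chars,
        "min_length": lengths[0][0],
        "max_length": lengths[-1][0],
        "median_length": median,
        "mean_length": total_chars / n,
    }


# Counts copied from the Common Values table for Gender.
print(length_stats({"Male": 1068, "Female": 1043}))
```

With the Gender counts this gives 10530 total characters, lengths between 4 and 6, a median of 4 and a mean of about 4.988, in line with the figures reported above.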

Age
Real number (ℝ)

Distinct: 40
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.315964
Minimum: 14
Maximum: 61
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 14
5-th percentile: 18
Q1: 20
median: 23
Q3: 26
95-th percentile: 38
Maximum: 61
Range: 47
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 6.3570781
Coefficient of variation (CV): 0.2614364
Kurtosis: 2.7985818
Mean: 24.315964
Median Absolute Deviation (MAD): 3
Skewness: 1.5213261
Sum: 51331
Variance: 40.412442
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=40)]

Common Values

Value  Count  Frequency (%)
21  236  11.2%
23  218  10.3%
26  213  10.1%
18  212  10.0%
19  169  8.0%
22  163  7.7%
20  150  7.1%
24  95  4.5%
25  82  3.9%
17  69  3.3%
Other values (30)  504  23.9%

Minimum 10 values

Value  Count  Frequency (%)
14  1  < 0.1%
15  1  < 0.1%
16  20  0.9%
17  69  3.3%
18  212  10.0%
19  169  8.0%
20  150  7.1%
21  236  11.2%
22  163  7.7%
23  218  10.3%

Maximum 10 values

Value  Count  Frequency (%)
61  1  < 0.1%
56  1  < 0.1%
55  5  0.2%
52  1  < 0.1%
51  2  0.1%
48  1  < 0.1%
47  1  < 0.1%
46  2  0.1%
45  3  0.1%
44  6  0.3%
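
Several of the descriptive statistics are tied together by simple identities: the variance is the square of the standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the MAD is the median of the absolute deviations from the median. The sketch below computes them with the standard library. It assumes the sample (n - 1) convention for the variance, which the report does not state; `describe` and the `ages` list are hypothetical and for illustration only.

```python
import statistics


def describe(values):
    """Basic descriptive statistics for a numeric sample."""
    values = list(values)
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    return {
        "mean": mean,
        "variance": std ** 2,
        "std": std,
        "cv": std / mean,  # coefficient of variation
        "range": max(values) - min(values),
        "mad": mad,  # median absolute deviation
    }


# Hypothetical ages, not taken from the dataset.
ages = [18, 21, 21, 23, 26, 38]
print(describe(ages))

# Consistency check on the values reported for Age above.
reported_std, reported_mean = 6.3570781, 24.315964
print(reported_std ** 2, reported_std / reported_mean)
```

The last line recomputes the variance and CV from the reported standard deviation and mean; both agree with the reported 40.412442 and 0.2614364 to the printed precision.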

Height
Real number (ℝ)

Alert: High correlation

Distinct: 51
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7016201
Minimum: 1.45
Maximum: 1.98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 1.45
5-th percentile: 1.55
Q1: 1.63
median: 1.7
Q3: 1.77
95-th percentile: 1.85
Maximum: 1.98
Range: 0.53
Interquartile range (IQR): 0.14

Descriptive statistics

Standard deviation: 0.093368402
Coefficient of variation (CV): 0.054870298
Kurtosis: -0.56508573
Mean: 1.7016201
Median Absolute Deviation (MAD): 0.07
Skewness: -0.0091150723
Sum: 3592.12
Variance: 0.0087176584
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
1.7  125  5.9%
1.75  122  5.8%
1.62  96  4.5%
1.76  96  4.5%
1.65  88  4.2%
1.6  77  3.6%
1.72  76  3.6%
1.63  75  3.6%
1.77  71  3.4%
1.71  68  3.2%
Other values (41)  1217  57.7%

Minimum 10 values

Value  Count  Frequency (%)
1.45  1  < 0.1%
1.46  1  < 0.1%
1.48  3  0.1%
1.49  3  0.1%
1.5  17  0.8%
1.51  11  0.5%
1.52  19  0.9%
1.53  27  1.3%
1.54  20  0.9%
1.55  32  1.5%

Maximum 10 values

Value  Count  Frequency (%)
1.98  2  0.1%
1.95  1  < 0.1%
1.94  1  < 0.1%
1.93  4  0.2%
1.92  4  0.2%
1.91  12  0.6%
1.9  7  0.3%
1.89  7  0.3%
1.88  10  0.5%
1.87  22  1.0%

Weight
Real number (ℝ)

Alert: High correlation

Distinct: 1335
Distinct (%): 63.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 86.586035
Minimum: 39
Maximum: 173
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 39
5-th percentile: 48.5
Q1: 65.47
median: 83
Q3: 107.43
95-th percentile: 131.915
Maximum: 173
Range: 134
Interquartile range (IQR): 41.96

Descriptive statistics

Standard deviation: 26.191163
Coefficient of variation (CV): 0.30248715
Kurtosis: -0.69988323
Mean: 86.586035
Median Absolute Deviation (MAD): 21.74
Skewness: 0.25542213
Sum: 182783.12
Variance: 685.977
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
80  59  2.8%
70  43  2.0%
50  42  2.0%
75  40  1.9%
60  37  1.8%
65  26  1.2%
90  23  1.1%
42  22  1.0%
78  19  0.9%
85  19  0.9%
Other values (1325)  1781  84.4%

Minimum 10 values

Value  Count  Frequency (%)
39  1  < 0.1%
39.1  1  < 0.1%
39.37  1  < 0.1%
39.7  1  < 0.1%
39.85  1  < 0.1%
40  1  < 0.1%
40.2  1  < 0.1%
40.34  1  < 0.1%
41.22  1  < 0.1%
41.27  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
173  1  < 0.1%
165.06  1  < 0.1%
160.94  1  < 0.1%
160.64  1  < 0.1%
155.87  1  < 0.1%
155.24  1  < 0.1%
154.62  1  < 0.1%
153.96  1  < 0.1%
153.15  1  < 0.1%
152.72  1  < 0.1%

family_history_with_overweight
Boolean

Alert: High correlation

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Value  Count  Frequency (%)
True  1726  81.8%
False  385  18.2%

FAVC
Boolean

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Value  Count  Frequency (%)
True  1866  88.4%
False  245  11.6%

FCVC
Real number (ℝ)

Distinct: 180
Distinct (%): 8.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4189863
Minimum: 1
Maximum: 3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1.52
Q1: 2
median: 2.39
Q3: 3
95-th percentile: 3
Maximum: 3
Range: 2
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.53399595
Coefficient of variation (CV): 0.22075196
Kurtosis: -0.63708303
Mean: 2.4189863
Median Absolute Deviation (MAD): 0.39
Skewness: -0.43305151
Sum: 5106.48
Variance: 0.28515167
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
3  658  31.2%
2  610  28.9%
1  34  1.6%
2.97  14  0.7%
2.05  12  0.6%
2.94  12  0.6%
2.91  12  0.6%
2.92  11  0.5%
2.96  11  0.5%
2.77  11  0.5%
Other values (170)  726  34.4%

Minimum 10 values

Value  Count  Frequency (%)
1  34  1.6%
1.01  2  0.1%
1.03  1  < 0.1%
1.04  2  0.1%
1.05  2  0.1%
1.06  2  0.1%
1.07  1  < 0.1%
1.08  3  0.1%
1.1  1  < 0.1%
1.11  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
3  658  31.2%
2.99  4  0.2%
2.98  9  0.4%
2.97  14  0.7%
2.96  11  0.5%
2.95  9  0.4%
2.94  12  0.6%
2.93  7  0.3%
2.92  11  0.5%
2.91  12  0.6%

NCP
Real number (ℝ)

Distinct: 256
Distinct (%): 12.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6856514
Minimum: 1
Maximum: 4
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2.66
median: 3
Q3: 3
95-th percentile: 3.75
Maximum: 4
Range: 3
Interquartile range (IQR): 0.34

Descriptive statistics

Standard deviation: 0.77807883
Coefficient of variation (CV): 0.28971699
Kurtosis: 0.38542151
Mean: 2.6856514
Median Absolute Deviation (MAD): 0
Skewness: -1.1069503
Sum: 5669.41
Variance: 0.60540667
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
3  1208  57.2%
1  205  9.7%
4  74  3.5%
2.99  17  0.8%
2.98  13  0.6%
2.97  9  0.4%
3.99  9  0.4%
2.88  8  0.4%
2.81  8  0.4%
2.66  7  0.3%
Other values (246)  553  26.2%

Minimum 10 values

Value  Count  Frequency (%)
1  205  9.7%
1.01  4  0.2%
1.02  2  0.1%
1.03  2  0.1%
1.04  1  < 0.1%
1.05  3  0.1%
1.06  2  0.1%
1.07  3  0.1%
1.08  5  0.2%
1.09  2  0.1%

Maximum 10 values

Value  Count  Frequency (%)
4  74  3.5%
3.99  9  0.4%
3.98  1  < 0.1%
3.97  1  < 0.1%
3.95  1  < 0.1%
3.94  1  < 0.1%
3.91  2  0.1%
3.9  2  0.1%
3.89  2  0.1%
3.88  1  < 0.1%

CAEC
Categorical

Alert: Imbalance

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 KiB

Length

Max length: 10
Median length: 9
Mean length: 8.8702037
Min length: 2

Characters and Unicode

Total characters: 18725
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sometimes
2nd row: Sometimes
3rd row: Sometimes
4th row: Sometimes
5th row: Sometimes

Common Values

Value  Count  Frequency (%)
Sometimes  1765  83.6%
Frequently  242  11.5%
Always  53  2.5%
no  51  2.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Value  Count  Frequency (%)
e  4014  21.4%
m  3530  18.9%
t  2007  10.7%
s  1818  9.7%
o  1816  9.7%
S  1765  9.4%
i  1765  9.4%
y  295  1.6%
l  295  1.6%
n  293  1.6%
Other values (7)  1127  6.0%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  18725  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  18725  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  18725  100.0%

SMOKE
Boolean

Alert: Imbalance

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Value  Count  Frequency (%)
False  2067  97.9%
True  44  2.1%

CH2O
Real number (ℝ)

Distinct: 201
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0080531
Minimum: 1
Maximum: 3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1.585
median: 2
Q3: 2.48
95-th percentile: 3
Maximum: 3
Range: 2
Interquartile range (IQR): 0.895

Descriptive statistics

Standard deviation: 0.61294955
Coefficient of variation (CV): 0.3052457
Kurtosis: -0.87925382
Mean: 2.0080531
Median Absolute Deviation (MAD): 0.45
Skewness: -0.10502701
Sum: 4239
Variance: 0.37570716
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
2  467  22.1%
1  221  10.5%
3  163  7.7%
2.17  16  0.8%
2.04  15  0.7%
2.65  15  0.7%
2.15  14  0.7%
2.01  14  0.7%
2.36  14  0.7%
2.08  13  0.6%
Other values (191)  1159  54.9%

Minimum 10 values

Value  Count  Frequency (%)
1  221  10.5%
1.01  10  0.5%
1.02  11  0.5%
1.03  13  0.6%
1.04  4  0.2%
1.05  6  0.3%
1.06  4  0.2%
1.07  4  0.2%
1.08  7  0.3%
1.09  2  0.1%

Maximum 10 values

Value  Count  Frequency (%)
3  163  7.7%
2.99  7  0.3%
2.98  10  0.5%
2.97  5  0.2%
2.96  6  0.3%
2.95  6  0.3%
2.94  3  0.1%
2.93  5  0.2%
2.92  3  0.1%
2.91  3  0.1%

SCC
Boolean

Alert: Imbalance

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 KiB

Value  Count  Frequency (%)
False  2015  95.5%
True  96  4.5%

FAF
Real number (ℝ)

Alert: Zeros

Distinct: 257
Distinct (%): 12.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0103126
Minimum: 0
Maximum: 3
Zeros: 421
Zeros (%): 19.9%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.125
median: 1
Q3: 1.67
95-th percentile: 2.68
Maximum: 3
Range: 3
Interquartile range (IQR): 1.545

Descriptive statistics

Standard deviation: 0.85061316
Coefficient of variation (CV): 0.84193062
Kurtosis: -0.62067528
Mean: 1.0103126
Median Absolute Deviation (MAD): 0.8
Skewness: 0.49854641
Sum: 2132.77
Variance: 0.72354275
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
0  421  19.9%
1  243  11.5%
2  199  9.4%
3  77  3.6%
0.99  15  0.7%
0.03  14  0.7%
0.01  13  0.6%
0.11  13  0.6%
1.98  13  0.6%
0.98  12  0.6%
Other values (247)  1091  51.7%

Minimum 10 values

Value  Count  Frequency (%)
0  421  19.9%
0.01  13  0.6%
0.02  10  0.5%
0.03  14  0.7%
0.04  7  0.3%
0.05  8  0.4%
0.06  5  0.2%
0.07  11  0.5%
0.08  2  0.1%
0.09  8  0.4%

Maximum 10 values

Value  Count  Frequency (%)
3  77  3.6%
2.97  1  < 0.1%
2.94  2  0.1%
2.93  1  < 0.1%
2.89  4  0.2%
2.88  3  0.1%
2.87  1  < 0.1%
2.85  1  < 0.1%
2.83  2  0.1%
2.82  1  < 0.1%

TUE
Real number (ℝ)

Alert: Zeros

Distinct: 813
Distinct (%): 38.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6578612
Minimum: 0
Maximum: 2
Zeros: 560
Zeros (%): 26.5%
Negative: 0
Negative (%): 0.0%
Memory size: 16.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.625
Q3: 1
95-th percentile: 2
Maximum: 2
Range: 2
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.60892621
Coefficient of variation (CV): 0.92561502
Kurtosis: -0.54863476
Mean: 0.6578612
Median Absolute Deviation (MAD): 0.485
Skewness: 0.61852392
Sum: 1388.745
Variance: 0.37079113
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
0  560  26.5%
1  292  13.8%
2  109  5.2%
0.631  6  0.3%
0.002  6  0.3%
0.097  4  0.2%
0.673  4  0.2%
0.128  4  0.2%
0.413  4  0.2%
0.003  4  0.2%
Other values (803)  1118  53.0%

Minimum 10 values

Value  Count  Frequency (%)
0  560  26.5%
0.001  3  0.1%
0.002  6  0.3%
0.003  4  0.2%
0.004  1  < 0.1%
0.005  1  < 0.1%
0.008  2  0.1%
0.009  3  0.1%
0.01  1  < 0.1%
0.011  1  < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
2  109  5.2%
1.992  1  < 0.1%
1.991  1  < 0.1%
1.984  1  < 0.1%
1.981  1  < 0.1%
1.978  1  < 0.1%
1.973  1  < 0.1%
1.971  1  < 0.1%
1.97  1  < 0.1%
1.967  1  < 0.1%

CALC
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 KiB

Length

Max length: 10
Median length: 9
Mean length: 6.9128375
Min length: 2

Characters and Unicode

Total characters: 14593
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: no
2nd row: Sometimes
3rd row: Frequently
4th row: Frequently
5th row: Sometimes

Common Values

Value  Count  Frequency (%)
Sometimes  1401  66.4%
no  639  30.3%
Frequently  70  3.3%
Always  1  < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Value  Count  Frequency (%)
e  2942  20.2%
m  2802  19.2%
o  2040  14.0%
t  1471  10.1%
s  1402  9.6%
S  1401  9.6%
i  1401  9.6%
n  709  4.9%
y  71  0.5%
l  71  0.5%
Other values (7)  283  1.9%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  14593  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  14593  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  14593  100.0%

MTRANS
Categorical

Alert: Imbalance

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 KiB

Length

Max length: 21
Median length: 21
Mean length: 18.128375
Min length: 4

Characters and Unicode

Total characters: 38269
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Public_Transportation
2nd row: Public_Transportation
3rd row: Public_Transportation
4th row: Walking
5th row: Public_Transportation

Common Values

Value  Count  Frequency (%)
Public_Transportation  1580  74.8%
Automobile  457  21.6%
Walking  56  2.7%
Motorbike  11  0.5%
Bike  7  0.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Value  Count  Frequency (%)
o  4096  10.7%
i  3691  9.6%
t  3628  9.5%
a  3216  8.4%
n  3216  8.4%
r  3171  8.3%
l  2093  5.5%
b  2048  5.4%
u  2037  5.3%
P  1580  4.1%
Other values (13)  9493  24.8%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  38269  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  38269  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  38269  100.0%

NObeyesdad
Categorical

Alert: High correlation

Distinct: 7
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 16.6 KiB

Length

Max length: 19
Median length: 16
Mean length: 16.192326
Min length: 13

Characters and Unicode

Total characters: 34182
Distinct characters: 27
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Normal_Weight
2nd row: Normal_Weight
3rd row: Normal_Weight
4th row: Overweight_Level_I
5th row: Overweight_Level_II

Common Values

Value  Count  Frequency (%)
Obesity_Type_I  351  16.6%
Obesity_Type_III  324  15.3%
Obesity_Type_II  297  14.1%
Overweight_Level_I  290  13.7%
Overweight_Level_II  290  13.7%
Normal_Weight  287  13.6%
Insufficient_Weight  272  12.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

Most occurring characters

Value  Count  Frequency (%)
e  5095  14.9%
_  3663  10.7%
I  3059  8.9%
i  2655  7.8%
t  2383  7.0%
y  1944  5.7%
O  1552  4.5%
s  1244  3.6%
v  1160  3.4%
g  1139  3.3%
Other values (17)  10288  30.1%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  34182  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  34182  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  34182  100.0%

Interactions

[Figures: 64 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

Variable  Age  CAEC  CALC  CH2O  FAF  FAVC  FCVC  Gender  Height  MTRANS  NCP  NObeyesdad  SCC  SMOKE  TUE  Weight  family_history_with_overweight
Age  1.000  0.154  0.164  0.008  -0.210  0.137  0.058  0.189  -0.003  0.348  -0.105  0.292  0.132  0.185  -0.298  0.357  0.235
CAEC  0.154  1.000  0.098  0.181  0.115  0.193  0.129  0.131  0.157  0.095  0.167  0.352  0.160  0.046  0.132  0.318  0.349
CALC  0.164  0.098  1.000  0.107  0.112  0.137  0.102  0.033  0.099  0.095  0.121  0.225  0.055  0.104  0.138  0.219  0.012
CH2O  0.008  0.181  0.107  1.000  0.157  0.191  0.066  0.233  0.223  0.088  0.068  0.229  0.130  0.074  0.023  0.223  0.228
FAF  -0.210  0.115  0.112  0.157  1.000  0.152  0.028  0.265  0.326  0.114  0.144  0.212  0.095  0.067  0.050  -0.044  0.158
FAVC  0.137  0.193  0.137  0.191  0.152  1.000  0.091  0.060  0.221  0.201  0.032  0.328  0.186  0.040  0.170  0.293  0.205
FCVC  0.058  0.129  0.102  0.066  0.028  0.091  1.000  0.349  -0.054  0.103  0.082  0.293  0.095  0.000  -0.086  0.210  0.118
Gender  0.189  0.131  0.033  0.233  0.265  0.060  0.349  1.000  0.622  0.162  0.162  0.556  0.098  0.035  0.130  0.396  0.099
Height  -0.003  0.157  0.099  0.223  0.326  0.221  -0.054  0.622  1.000  0.090  0.207  0.206  0.173  0.162  0.079  0.462  0.301
MTRANS  0.348  0.095  0.095  0.088  0.114  0.201  0.103  0.162  0.090  1.000  0.042  0.179  0.070  0.000  0.126  0.140  0.118
NCP  -0.105  0.167  0.121  0.068  0.144  0.032  0.082  0.162  0.207  0.042  1.000  0.244  0.045  0.027  0.085  0.005  0.189
NObeyesdad  0.292  0.352  0.225  0.229  0.212  0.328  0.293  0.556  0.206  0.179  0.244  1.000  0.235  0.111  0.216  0.575  0.540
SCC  0.132  0.160  0.055  0.130  0.095  0.186  0.095  0.098  0.173  0.070  0.045  0.235  1.000  0.033  0.128  0.235  0.181
SMOKE  0.185  0.046  0.104  0.074  0.067  0.040  0.000  0.035  0.162  0.000  0.027  0.111  0.033  1.000  0.058  0.129  0.000
TUE  -0.298  0.132  0.138  0.023  0.050  0.170  -0.086  0.130  0.079  0.126  0.085  0.216  0.128  0.058  1.000  -0.050  0.187
Weight  0.357  0.318  0.219  0.223  -0.044  0.293  0.210  0.396  0.462  0.140  0.005  0.575  0.235  0.129  -0.050  1.000  0.557
family_history_with_overweight  0.235  0.349  0.012  0.228  0.158  0.205  0.118  0.099  0.301  0.118  0.189  0.540  0.181  0.000  0.187  0.557  1.000
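
The "High correlation" alerts in the overview can be related to this matrix by flagging every pair whose absolute coefficient reaches a cutoff. The sketch below uses a cutoff of 0.5, which is an assumption chosen because it reproduces the alerts listed in this report; the names `REPORTED` and `high_correlation_pairs` are illustrative, and only a subset of the pairs is copied in.

```python
# Pairwise values copied from the correlation table above (subset only).
REPORTED = {
    ("Gender", "Height"): 0.622,
    ("Gender", "NObeyesdad"): 0.556,
    ("Gender", "Weight"): 0.396,
    ("Height", "Weight"): 0.462,
    ("Height", "NObeyesdad"): 0.206,
    ("NObeyesdad", "Weight"): 0.575,
    ("NObeyesdad", "family_history_with_overweight"): 0.540,
    ("Weight", "family_history_with_overweight"): 0.557,
    ("Age", "TUE"): -0.298,
}


def high_correlation_pairs(pairwise, threshold=0.5):
    """Return (a, b, r) for every pair with |r| >= threshold, sorted by name."""
    return sorted(
        (a, b, r) for (a, b), r in pairwise.items() if abs(r) >= threshold
    )


for a, b, r in high_correlation_pairs(REPORTED):
    print(f"{a} <-> {b}: {r:.3f}")
```

With this cutoff the flagged pairs are Gender/Height, Gender/NObeyesdad, NObeyesdad/Weight, NObeyesdad/family_history_with_overweight and Weight/family_history_with_overweight, which matches the per-variable counts in the alerts.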

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  Gender  Age  Height  Weight  family_history_with_overweight  FAVC  FCVC  NCP  CAEC  SMOKE  CH2O  SCC  FAF  TUE  CALC  MTRANS  NObeyesdad
0  Female  21  1.62  64.0  yes  no  2.0  3.0  Sometimes  no  2.0  no  0.0  1.0  no  Public_Transportation  Normal_Weight
1  Female  21  1.52  56.0  yes  no  3.0  3.0  Sometimes  yes  3.0  yes  3.0  0.0  Sometimes  Public_Transportation  Normal_Weight
2  Male  23  1.80  77.0  yes  no  2.0  3.0  Sometimes  no  2.0  no  2.0  1.0  Frequently  Public_Transportation  Normal_Weight
3  Male  27  1.80  87.0  no  no  3.0  3.0  Sometimes  no  2.0  no  2.0  0.0  Frequently  Walking  Overweight_Level_I
4  Male  22  1.78  89.8  no  no  2.0  1.0  Sometimes  no  2.0  no  0.0  0.0  Sometimes  Public_Transportation  Overweight_Level_II
5  Male  29  1.62  53.0  no  yes  2.0  3.0  Sometimes  no  2.0  no  0.0  0.0  Sometimes  Automobile  Normal_Weight
6  Female  23  1.50  55.0  yes  yes  3.0  3.0  Sometimes  no  2.0  no  1.0  0.0  Sometimes  Motorbike  Normal_Weight
7  Male  22  1.64  53.0  no  no  2.0  3.0  Sometimes  no  2.0  no  3.0  0.0  Sometimes  Public_Transportation  Normal_Weight
8  Male  24  1.78  64.0  yes  yes  3.0  3.0  Sometimes  no  2.0  no  1.0  1.0  Frequently  Public_Transportation  Normal_Weight
9  Male  22  1.72  68.0  yes  yes  2.0  3.0  Sometimes  no  2.0  no  1.0  1.0  no  Public_Transportation  Normal_Weight

Last rows

Index  Gender  Age  Height  Weight  family_history_with_overweight  FAVC  FCVC  NCP  CAEC  SMOKE  CH2O  SCC  FAF  TUE  CALC  MTRANS  NObeyesdad
2101  Female  26  1.63  107.22  yes  yes  3.0  3.0  Sometimes  no  2.49  no  0.07  0.456  Sometimes  Public_Transportation  Obesity_Type_III
2102  Female  26  1.63  108.11  yes  yes  3.0  3.0  Sometimes  no  2.32  no  0.05  0.413  Sometimes  Public_Transportation  Obesity_Type_III
2103  Female  21  1.72  133.03  yes  yes  3.0  3.0  Sometimes  no  1.65  no  1.54  0.912  Sometimes  Public_Transportation  Obesity_Type_III
2104  Female  22  1.73  133.04  yes  yes  3.0  3.0  Sometimes  no  1.61  no  1.51  0.931  Sometimes  Public_Transportation  Obesity_Type_III
2105  Female  21  1.73  131.34  yes  yes  3.0  3.0  Sometimes  no  1.80  no  1.73  0.898  Sometimes  Public_Transportation  Obesity_Type_III
2106  Female  21  1.71  131.41  yes  yes  3.0  3.0  Sometimes  no  1.73  no  1.68  0.906  Sometimes  Public_Transportation  Obesity_Type_III
2107  Female  22  1.75  133.74  yes  yes  3.0  3.0  Sometimes  no  2.01  no  1.34  0.599  Sometimes  Public_Transportation  Obesity_Type_III
2108  Female  23  1.75  133.69  yes  yes  3.0  3.0  Sometimes  no  2.05  no  1.41  0.646  Sometimes  Public_Transportation  Obesity_Type_III
2109  Female  24  1.74  133.35  yes  yes  3.0  3.0  Sometimes  no  2.85  no  1.14  0.586  Sometimes  Public_Transportation  Obesity_Type_III
2110  Female  24  1.74  133.47  yes  yes  3.0  3.0  Sometimes  no  2.86  no  1.03  0.714  Sometimes  Public_Transportation  Obesity_Type_III

Duplicate rows

Most frequently occurring

Index  Gender  Age  Height  Weight  family_history_with_overweight  FAVC  FCVC  NCP  CAEC  SMOKE  CH2O  SCC  FAF  TUE  CALC  MTRANS  NObeyesdad  # duplicates
7  Male  21  1.62  70.0  no  yes  2.0  1.0  no  no  3.0  no  1.0  0.0  Sometimes  Public_Transportation  Overweight_Level_I  15
3  Female  21  1.52  42.0  no  yes  3.0  1.0  Frequently  no  1.0  no  0.0  0.0  Sometimes  Public_Transportation  Insufficient_Weight  4
0  Female  16  1.66  58.0  no  no  2.0  1.0  Sometimes  no  1.0  no  0.0  1.0  no  Walking  Normal_Weight  2
1  Female  18  1.62  55.0  yes  yes  2.0  3.0  Frequently  no  1.0  no  1.0  1.0  no  Public_Transportation  Normal_Weight  2
2  Female  21  1.52  42.0  no  no  3.0  1.0  Frequently  no  1.0  no  0.0  0.0  Sometimes  Public_Transportation  Insufficient_Weight  2
4  Female  22  1.69  65.0  yes  yes  2.0  3.0  Sometimes  no  2.0  no  1.0  1.0  Sometimes  Public_Transportation  Normal_Weight  2
5  Female  25  1.57  55.0  no  yes  2.0  1.0  Sometimes  no  2.0  no  2.0  0.0  Sometimes  Public_Transportation  Normal_Weight  2
6  Male  18  1.72  53.0  yes  yes  2.0  3.0  Sometimes  no  2.0  no  0.0  2.0  Sometimes  Public_Transportation  Insufficient_Weight  2
8  Male  22  1.74  75.0  yes  yes  3.0  3.0  Frequently  no  1.0  no  1.0  0.0  no  Automobile  Normal_Weight  2
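
A table like this one can be produced by grouping identical rows and keeping the groups that occur more than once. The table lists 9 distinct row patterns, the same number the overview reports as duplicate rows. Treating "# duplicates" as the number of occurrences of each pattern is an assumption here. The sketch below uses hypothetical, shortened rows and a hypothetical helper name `duplicate_groups`.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical, shortened rows (Gender, Age, Height, Weight) for illustration.
rows = [
    ("Male", 21, 1.62, 70.0),
    ("Male", 21, 1.62, 70.0),
    ("Male", 21, 1.62, 70.0),
    ("Female", 16, 1.66, 58.0),
    ("Female", 16, 1.66, 58.0),
    ("Male", 23, 1.80, 77.0),
]

groups = duplicate_groups(rows)
# Most frequent pattern first, as in the table above.
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, n)
```

Counting patterns and counting surplus rows give different totals, so it is worth checking which of the two a "duplicate rows" figure refers to before dropping rows.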